Query capabilities of the Karma provenance framework
نویسندگان
چکیده
Provenance metadata in e-Science captures the derivation history of data products generated from scientific workflows. Provenance forms a glue linking workflow execution with associated data products, and finds use in determining the quality of derived data, tracking resource usage, and for verifying and validating scientific experiments. In this article, we discuss the scope of provenance collected in the Karma provenance framework used in the LEAD Cyberinfrastructure project, distinguishing provenance metadata from generic annotations. We further describe our approaches to querying for different forms of provenance in Karma while addressing the queries proposed in the First Provenance Challenge. We use an incremental, building-block method to construct provenance queries based on the fundamental querying capabilities provided by the Karma service centered on the provenance data model. This has the advantage of keeping the Karma service generic and simple, and yet supports a wide range of queries. Karma successfully answers all but one Challenge query. Copyright c © 2007 John Wiley & Sons, Ltd.
منابع مشابه
Performance Evaluation of the Karma Provenance Framework for Scientific Workflows
Provenance about workflow executions and data derivations in scientific applications help estimate data quality, track resources, and validate in silico experiments. The Karma provenance framework provides a means to collect workflow, process, and data provenance from data-driven scientific workflows and is used in the Linked Environments for Atmospheric Discovery (LEAD) project. This paper pre...
متن کاملModel of Karma Version 3
Provenance that captures e-Science activity has long term value only if the right amount and kind of information is collected. In this paper, we propose a two-layer model for representing provenance information capable of representing both execution information and higher level process details. The information model forms the basis for efficient relational database storage and query, and sets t...
متن کاملChoosing a Data Model and Query Language for Provenance
The ancestry relationships found in provenance form a directed graph. Many provenance queries require traversal of this graph. The data and query models for provenance should directly and naturally address this graph-centric nature of provenance. To that end, we set out the requirements for a provenance data and query model and discuss why the common solutions (relational, XML, RDF) fall short....
متن کاملGProM - A Swiss Army Knife for Your Provenance Needs
We present an overview of GProM, a generic provenance middleware for relational databases. The system supports diverse provenance and annotation management tasks through query instrumentation, i.e., compiling a declarative frontend language with provenance-specific features into the query language of a backend database system. In addition to introducing GProM, we also discuss research contribut...
متن کاملPrOM: A Semantic Web Framework for Provenance Management in Science
The eScience paradigm is enabling researchers to collaborate over the Web in virtual laboratories and conduct experiments on an industrial scale. But, the inherent variability in the quality and trust associated with eScience resources necessitates the use of provenance information describing the origin of an entity. Existing systems often model provenance using ambiguous terminology, have poor...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Concurrency and Computation: Practice and Experience
دوره 20 شماره
صفحات -
تاریخ انتشار 2008